library(dartRverse)
library(ggplot2)
library(dismo)
library(leaflet)
library(htmltools)
library(leaflegend)
library(sf)
library(raster)8 Landscape Genetics
Session Presenters

Required packages
make sure you have the packages installed, see Install dartRverse
Introduction
Landscape genetics combines landscape ecology with population genetics to understand how the structure, composition and configuration of the landscape affects gene flow, genetic drift, and selection.
In this tutorial, we will dive into the first question - how does the landscape influence gene flow? There are many ways to tackle this question, for example, we can explore gene flow at the individual or population level, over different spatial scales, and/or across a meta-population.
Here we’ll start with the basics, by looking at the spatial distribution of individuals, and build on this until we incorporate landscape features, to understand:
Different dispersal strategies; and
Limitations to gene flow, including:
a) Geographic distance (isolation-by-distance);
b) Geographic barriers (isolation-by-barrier);
c) Landscape attributes (isolation-by-resistance).
1. Dispersal strategies
Many biological and demographic processes can shape patterns of genetic structure, for example, dispersal strategies, mating systems, and social behaviours. These processes often occur over a fine scale. As such, exploring patterns within populations over 10s-100s of metres is often very informative, especially in small and inconspicuous species.
Before diving into more complex and correlative analyses, it is useful to understand some of the baseline life-history and demographic traits in your species. One common question is: how far do individuals disperse and is this the same or different for males and females?
When one sex disperses further than the other, this is known as sex-biased dispersal. This can be detected via genetic spatial autocorrelation analysis. We’ll describe this in more detail later on, but in short, increased philopatry by one sex results in greater genetic similarity among neighbouring individuals, while genetic similarity is reduced in the dispersing sex. You can also learn more about it in Banks & Peakall 2012 (and references therein).
Natal philopatry describes when an animal remains close or returns to the area where they were born.
To explore the genetic patterns that result from sex-biased dispersal, we’re going to simulate this process in a hypothetical animal population. First, we need to set up the dispersal kernels, using the custom function below.
# Create a function, where:
# x = minimum to maximum distance,
# d0 = mean distance,
# p= proportion that go to mean
p2p <- function(x, d0, p)
{
return (exp(((x/d0)*log(p))))
}
# Dispersal kernels:
# Females
dispfemale <- (p2p(x = 1:50, # range of distances
d0 = 1, # mean dispersal distance
p = 0.5)) # proportion that go to mean
pdispfemale <- c(0, cumsum(dispfemale)/sum(dispfemale))
# Males
dispmale <- (p2p(x = 1:50, # range of distances
d0 = 20, # mean dispersal distance
p = 0.5)) # proportion that go to mean
pdispmale <- c(0, cumsum(dispmale)/sum(dispmale))Let’s look at the dispersal curves. In this case, males disperse further than females (i.e., there is male-biased dispersal):
# Plot
plot(dispfemale,
type = "l", # plot line
col = '#E69F00', # line colour
lwd = 2,
main = "Female dispersal curve",
xlab = "Distance (m)",
ylab = "")plot(dispmale,
type = "l", # plot line
col = '#5D3FD3', # line colour
lwd = 2,
main = "Male dispersal curve",
xlab = "Distance (m)",
ylab = "")

Next we’ll create a function that generates a population using glSim, which simulates simple SNP data and returns a genlight object. In our hypothetical population, females randomly mate with the nearest male, produce two offspring at a 50:50 sex-ratio, and the offspring then disperse following the dispersal probabilities created above.
# Create a function, where: Nind = number of individuals, Nsnp = number
# of SNPs, pdispF = female dispersal probability (generated above)
# pdispM = male dispersal probability (generated above) seed = set the
# seed so simulation is repeatable
SimDisp <- function(Nind, Nsnp, pdispF, pdispM, seed) {
set.seed(seed)
# Simulate a genlight object
gg <- glSim(n.ind = Nind, n.snp.nonstruc = Nsnp, grp.size = c(0.999,
0.001), ploidy = 2, k = 1, LD = FALSE, ind.names = paste(sprintf("ind%03d",
1:Nind), sep = ""), snp.names = paste(sprintf("snp%03d", 1:Nsnp),
sep = ""))
# Define pop
pop(gg) <- rep("A", Nind)
# Create all the other parameters
gg <- gl.compliance.check(gg)
# Define sex using a sex ratio of 0.5
ds <- c(rep("F", 0.5 * Nind), rep("M", (1 - 0.5) * Nind))
ds <- ds[order(runif(length(ds)))]
gg@other$ind.metrics$sex <- factor(ds)
# Set coordinates
xy <- expand.grid(x = 1:100, y = 1:100)
# sample from the grid
xys <- xy[sample(1:nrow(xy), Nind, replace = FALSE), ]
gg@other$ind.metrics$x <- xys$x
gg@other$ind.metrics$y <- xys$y
for (ii in 1:20) {
# Run for 20 generations
# Find mating pairs & reproduce
females <- which(gg@other$ind.metrics$sex == "F")
males <- which(gg@other$ind.metrics$sex == "M")
off <- list()
for (i in 1:length(females)) {
mfemale <- gg[females[i], ]
fxy <- c(gg@other$ind.metrics$x[females[i]], gg@other$ind.metrics$y[females[i]])
mxy <- cbind(gg@other$ind.metrics$x[males], gg@other$ind.metrics$y[males])
xxyy <- rbind(fxy, mxy)
# Find closest mating male
chosenm <- which.min(as.matrix(dist(xxyy))[1, -1])
mmale <- gg[males[chosenm], ]
doff <- gl.sim.offspring(mmale, mfemale, sexratio = 0.5, noffpermother = 2) # 2 offspring
doff@other$ind.metrics <- data.frame(sex = factor(c("F", "M")),
x = mfemale@other$ind.metrics$x, y = mfemale@other$ind.metrics$y)
off[[i]] <- doff
}
gg <- do.call(rbind, off)
# Make all offspring adults
xx <- (lapply(off, function(x) x@other$ind.metrics))
xx <- do.call(rbind, xx)
gg@other$ind.metrics <- xx
# Emigrate depending on dispersial distance
for (i in 1:nInd(gg)) {
# Use dispersal curves generated above to determine
# distance
if (gg@other$ind.metrics$sex[i] == "M")
dis <- max(which(runif(1) > pdispM)) else dis = max(which(runif(1) > pdispF))
# Dispersal direction
dir <- runif(1, 0, 2 * pi)
# Assign new coordinates
gg@other$ind.metrics$x[i] <- gg@other$ind.metrics$x[i] + dis *
cos(dir)
gg@other$ind.metrics$y[i] <- gg@other$ind.metrics$y[i] + dis *
sin(dir)
}
# Use torus to determine what to do with individuals that
# disperse outside of extent
gg@other$ind.metrics$x <- gg@other$ind.metrics$x%%100
gg@other$ind.metrics$y <- gg@other$ind.metrics$y%%100
# Plot
plot(gg@other$ind.metrics$x, gg@other$ind.metrics$y, pch = 16, cex = 1,
col = c("#E69F00", "#5D3FD3")[as.numeric(gg@other$ind.metrics$sex)],
asp = 1, main = paste("Generation", ii), xlab = "x", ylab = "y")
legend("bottomleft", legend = c("Female", "Male"), col = c("#E69F00",
"#5D3FD3"), pch = 16, cex = 1)
}
# Output genlight object containing simulated data
return(gg)
}Now run the function to simulate a population with 100 individuals, 1000 SNPs, and male-biased dispersal.
The simulation goes for 20 generations - you can see an animation below
simdat <- SimDisp(Nind = 100, Nsnp = 1000, pdispF = pdispfemale, pdispM = pdispmale,
seed = 123)
What does this mean for fine-scale genetic structure? Can we identify different dispersal patterns for males and females? Let’s run genetic spatial autocorrelation to find out. We’ll use the function gl.spatial.autoCorr. This is a multivariate approach that combines all loci into a single analysis (Smouse & Peakall 1999; Peakall et al. 2003; Banks & Peakall 2012). The autocorrelation coefficient “r” is calculated for each pair of individuals in each specified distance class (called “bins” in this function).
We’ll use evenly distributed bins and compare individuals at 10 m intervals up to 50 m.
There are two ways to test whether fine-scale genetic structure is significantly positive (i.e., individuals are genetically similar) or negative (i.e., individuals are genetically dissimilar).
The first approach is a one-tail permutation test, which provides 95% null hypothesis confidence regions. If the autocorrelation coefficient “r” falls outsdide of this range, significant fine-scale spatial genetic structure has been detected.
The second approach is to estimate 95% bootstrap confidence intervals (CIs) around the value for r. These are obtained by drawing (with replacement) from within the set of relevant pairwise comparisons for a given distance class. If bootstrap CIs overlap zero, there is no fine-scale spatial genetic structure present. Bootstrap CIs also allow you to test for significant differences between groups (e.g., between females and males). When CIs are non-overlapping, there is a significant difference.
I tend to use 95% bootstrap CIs, since they are more conservative than permutational tests and allow for comparisons. The gl.spatial.autoCorr function below will output 95% bootstrap CIs. You can try a permutation test by adding permutation = TRUE (note that for this option to work, you will need to create separate plots for each sex by choosing plot.pops.together = FALSE).
# Make 'sex' the population name
pop(simdat) <- simdat@other$ind.metrics$sex
# Add xy coordinates to the 'other' slot in genlight
simdat@other$latlon <- data.frame(lon = simdat@other$ind.metrics$x, lat = simdat@other$ind.metrics$y)
simdat@other$latlon <- Mercator(simdat@other$latlon, inverse = T)
# Run genetic spatial autocorrelation
gl.spatial.autoCorr(simdat, bins = seq(0, 50, 10), plot.pops.together = TRUE,
plot.colors.pop = c("#E69F00", "#5D3FD3"))
We can see that:
- Both females and males show significant positive spatial autocorrelation up to 30 m (where confidence intervals overlap zero). In other words, once you start comparing individuals that are 30 m apart, they are unlikely to be related/genetically similar (regardless of sex).
- Females have significantly stronger fine-scale genetic structure than males up to 10 m (i.e., a greater “r” value and female and male bootstrap CIs are non-overlapping).
These results are not surprising, since we set our distance classes based on the known (simulated) dispersal capacity of males and females, and we had large sample sizes (both individuals and loci). In reality, this analysis is a careful balance between power (maximising the sample size in each bin), and ensuring you are looking at the right scale (i.e., the size of the distance class).
What happens when you vary the dispersal distances? Can you pick up small differences between the sexes?
What happens if you change the number of individuals or SNPs? Does this influence the sensitivity of the analysis? Which is more important - more individuals or more loci?
What happens when you vary the size and number of distance classes? How does this interact with sample size? What do you think would happen if you used a single 50 m distance class - do we still see the signal of male-biased dispersal?
Try varying the following parameters:
Dispersal curves: “d0” and “p” when running the
p2pfunctionSample sizes: “Nind” (number of individuals) and “Nsnp” (number of loci) when running the
SimDispfunctionDistance classes: “bins” when running the
gl.spatial.autoCorrfunction
2a. Isolation-by-distance
Isolation-by-distance (IBD) is a key concept in population genetics. It describes a simple idea: that geographic distance can influence gene flow because individuals or populations that are geographically distant from each other are less likely to share genetic material than those that are close (based on the “stepping-stone” model; Kimura & Weiss 1964).
We can test for IBD with Mantel tests of matrix correspondence (Mantel 1967), implemented using the function gl.ibd. This test compares pairwise geographic and genetic distance matrices, using permutations to test for significance. For the latter, you can use pairwise population FST values or individual-by-individual genetic distances.
It’s easy enough to plot the relationship between geographic and genetic distance. However, we can’t use a standard regression to test this relationship because pairwise data are not independent. Mantel tests solve this problem by using permutation to test for significance (e.g., the data are randomly shuffled, and if there is no relationship, the result will be similar).
IBD tests often use pairwise population metrics like FST. This is useful when there are lots of samples and populations are well defined. However, there are often situations where we don’t have multiple samples per location, populations are difficult to define and/or individuals are more continuously distributed. In this case, we can use the individual as the unit of analysis.
Where FST estimates represent evolutionary averages, pairwise individual genetic distances may represent more contemporary patterns of dispersal. Therefore, the approach you take depends on the species, your sampling, and the question you want to ask.
Let’s test for IBD in our simulated data set:
total.ibd <- gl.ibd(simdat, distance = "euclidean", coordinates = "latlon")
# Show Mantel statistics
total.ibd$mantel
Mantel statistic based on Pearson's product-moment correlation
Call:
vegan::mantel(xdis = Dgen, ydis = Dgeo, permutations = permutations, na.rm = TRUE)
Mantel statistic r: 0.08552
Significance: 0.013
Upper quantiles of permutations (null model):
90% 95% 97.5% 99%
0.0512 0.0634 0.0767 0.0866
Permutation: free
Number of permutations: 999
There is significant IBD across all individuals, but it’s not very pronounced. Can we still see the signal of male-biased dispersal? Let’s create separate IBD plots for females and males:
f.ibd <- gl.ibd(simdat[simdat@pop == "F"], distance = "euclidean", coordinates = "latlon")
m.ibd <- gl.ibd(simdat[simdat@pop == "M"], distance = "euclidean", coordinates = "latlon")
# Show Mantel statistics for females
f.ibd$mantel
Mantel statistic based on Pearson's product-moment correlation
Call:
vegan::mantel(xdis = Dgen, ydis = Dgeo, permutations = permutations, na.rm = TRUE)
Mantel statistic r: 0.1438
Significance: 0.014
Upper quantiles of permutations (null model):
90% 95% 97.5% 99%
0.0828 0.1053 0.1302 0.1476
Permutation: free
Number of permutations: 999
# Show Mantel statistics for males
m.ibd$mantel
Mantel statistic based on Pearson's product-moment correlation
Call:
vegan::mantel(xdis = Dgen, ydis = Dgeo, permutations = permutations, na.rm = TRUE)
Mantel statistic r: 0.02018
Significance: 0.375
Upper quantiles of permutations (null model):
90% 95% 97.5% 99%
0.0825 0.1027 0.1199 0.1514
Permutation: free
Number of permutations: 999
Yes - we see significant IBD in females, since dispersal is more restricted. Males show no pattern of IBD, since many individuals can disperse right up to the spatial extent of our sampling. However, we lose some of the nuance from the previous analysis.
Mantel tests show us the broad trend across our study. IBD often acts as our null hypothesis (along with panmixia - i.e., random mating and dispersal) against which to test further landscape genetic analyses. However, unless the signal of sex-biased dispersal is very strong across the whole dataset, a Mantel test is unlikely to detect it. Spatial autocorrelation analysis is usually more powerful in detecting fine-scale genetic patterns than Mantel tests, because samples are “binned” and genetic structure is explored across multiple distance classes.
1. What happens if you use the proportion of shared alleles instead of euclidean distance?
In general, we expect a linear relationship between geographic (sometimes log-transformed) and genetic distance. A discontinuous relationship can be an indication that something else is going on. We will explore this idea below using a real example.
Lesser hairy-footed dunnarts in the Pilbara

Load the data
load("Data/Sy.gl.RData") # genetic dataSminthopsis youngsonii (or the lesser hairy footed dunnart), is a small carnivouros dasyurid found across the Australian arid region. As the name suggests, their hind foot is covered in short, bristly hairs, which are thought to help them navigate sandy soils. They are usually found on sand dunes, inter dune swale and red sand plains.
These samples were taken in the Pilbara at the very western edge of their range. Let’s take a look at a map:
leaflet(Sy.gl@other$latlong) %>%
addTiles() %>%
addCircleMarkers(lng = ~lon, lat = ~lat, popup = ~htmlEscape(lon))Given this is the edge of their range, and the Pilbara is quite rocky and doesn’t harbour much of the ideal habitat for this species, we want to know a little more about dispersal and connectivity in this area. A good first step is to test for isolation by distance.
Sy.ibd <- gl.ibd(Sy.gl, distance = "euclidean")
# Show Mantel statistics
Sy.ibd$mantel
Mantel statistic based on Pearson's product-moment correlation
Call:
vegan::mantel(xdis = Dgen, ydis = Dgeo, permutations = permutations, na.rm = TRUE)
Mantel statistic r: 0.6513
Significance: 0.001
Upper quantiles of permutations (null model):
90% 95% 97.5% 99%
0.0613 0.0897 0.1204 0.1594
Permutation: free
Number of permutations: 999
Yes - there is significant IBD! The pattern is very strong. But… some of the points don’t seem to follow the same slope? Why might this be?
There seems to be a big gap in sampling. This could be due to a sampling bias (e.g., the missing area is difficult to get to). Alternatively, the species may not occur here, suggesting there might be something going on with the habitat.
Could substrate be driving this sampling gap and the strange IBD pattern?
Let’s separate the individuals into two populations using longitude. You can click on the points in the map above to choose a location.
# Choose a point that separates the samples into two populations
lon.break <- 114.87
# Assign eastern and western populations based on this point
Sy.gl@pop <- as.factor(ifelse(Sy.gl@other$latlong$lon < lon.break, "1.West",
"2.East"))Now let’s take another look at IBD, but this time we’ll include the population information.
gl.ibd(Sy.gl, distance = "euclidean", paircols = "pop")
It looks like the magnitude of genetic distance is different and there are two different slopes describing IBD within each ‘population’. The points with large geographic and genetic distances represent among population comparisons. Discontinuities like this can sometimes suggest a barrier to gene flow. Let’s take a look at the substrate.
# Load a soil shapefile
soil <- st_read("Data/Soil.shp") # spatial data
# Create a colour palette for the two populations
pop.pal <- colorFactor(palette = c("#F8766D", "#00BFC4"), domain = Sy.gl@pop)
# Create a palette
soil.pal <- colorFactor(palette = c("#9C640C", "#5c3001", "#C0392B", "#EDC001",
"#FDEFB2"), domain = soil$Type)# Generate another map
leaflet(cbind(pop = Sy.gl@pop, Sy.gl@other$latlong)) %>%
addTiles() %>%
addPolygons(data = soil, weight = 0, fillOpacity = 0.7, color = ~soil.pal(Type)) %>%
addCircleMarkers(lng = ~lon, lat = ~lat, color = ~pop.pal(pop)) %>%
addLegendFactor(pal = soil.pal, shape = "rect", fillOpacity = 0.7, opacity = 0,
values = ~soil$Type, title = "Soil type", position = "topright",
data = gadmCHE, group = "Polygons")It looks like clay could be acting as a barrier. This is a good theory, but it would be useful to be able to visualise these patterns of isolation by distance spatially to get an indication of where exactly gene flow is being restricted. We’ll explore this idea in the next section.
2b. Isolation-by-barrier
Up to this point, we’ve tested for IBD - but could be helpful to visually inspect patterns of IBD across the landscape. We can do this with EEMS…
2c. Isolation-by-resistance
Up to this point, we’ve only visually inspected the landscape, could be useful to explicitly test the correlation between gene flow and landscape features.
Further Study
still to come…